
Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images

Author: Gernot A. Fink
Waleed M. Gondal
René Grzeszick
Michael Hirsch
Jan M. Köhler
Publication date: 01/01/2017

Convolutional neural networks (CNNs) show impressive performance for image classification and detection, extending heavily to the medical image domain. Nevertheless, medical experts are sceptical of these predictions because the nonlinear multilayer structure that produces a classification outcome is not directly graspable. Recently, approaches have been presented that help the user understand which discriminative regions within an image are decisive for the CNN to conclude a certain class. Although these approaches could help build trust in the CNN's predictions, they have rarely been shown to work with medical image data, which often poses a challenge because the decision for a class relies on different lesion areas scattered around the entire image. Using the DiaretDB1 dataset, we show that on retina images different lesion areas fundamental for diabetic retinopathy are detected on an image level with high accuracy, comparable to or exceeding supervised methods. On the lesion level, we achieve few false positives with high sensitivity, even though the network is trained solely on image-level labels, which do not include information about existing lesions. Classifying between diseased and healthy images, we achieve an AUC of 0.954 on DiaretDB1.

Comment: Accepted in Proc. IEEE International Conference on Image Processing (ICIP), 201
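
For readers unfamiliar with the AUC figure quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from image-level scores. It uses the pairwise (Mann-Whitney) formulation: the fraction of diseased/healthy image pairs in which the diseased image receives the higher score. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    ranked correctly; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical image-level labels (1 = diseased, 0 = healthy) and
# classifier scores; these are made up for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

In this toy example eight of the nine diseased/healthy pairs are ordered correctly, so `auc` is 8/9. The quadratic loop is chosen for clarity; a rank-based computation would be preferable for large datasets.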

arXiv.org e-Print Archive

MPG.PuRe

Focusing computational visual attention in multi-modal human-robot interaction

Author: Boris Schauerte
Gernot A. Fink
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

Identifying verbally and non-verbally referred-to objects is an important aspect of human-robot interaction. Most importantly, it is essential to achieve a joint focus of attention and, thus, a natural interaction behavior. In this contribution, we introduce a saliency-based model that reflects how multi-modal referring acts influence the visual search, i.e. the task to find a specific object in a scene. Therefore, we combine positional information obtained from pointing gestures with contextual knowledge about the visual appearance of the referred-to object obtained from language. The available information is then integrated into a biologically-motivated saliency model that forms the basis for visual search. We prove the feasibility of the proposed approach by presenting the results of an experimental evaluation.
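
The idea of combining a positional cue with an appearance cue can be sketched as follows. This is a simplified illustration and not the authors' model: it assumes the pointing gesture yields a pixel location that is turned into a Gaussian spatial prior, that the verbal description yields a per-pixel appearance-match map, and that the two are fused by element-wise multiplication. All names (`gaussian_prior`, `combine`, `argmax2d`) and values are hypothetical.

```python
import math


def gaussian_prior(height, width, center, sigma):
    """Spatial prior that peaks at the (row, col) location pointed at."""
    cy, cx = center
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def combine(appearance, prior):
    """Fuse the two cues by element-wise multiplication (an assumption)."""
    return [
        [a * p for a, p in zip(a_row, p_row)]
        for a_row, p_row in zip(appearance, prior)
    ]


def argmax2d(grid):
    """Return the (row, col) of the first maximum in a 2-D list."""
    best_value, best_pos = None, None
    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_pos = value, (y, x)
    return best_pos


# Hypothetical appearance map: two objects match the verbal description
# equally well (value 1.0), at (0, 0) and (2, 3); the rest is background.
appearance = [
    [1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]

# Hypothetical pointing gesture that lands near the second object.
prior = gaussian_prior(height=3, width=4, center=(2, 2), sigma=1.0)
focus = argmax2d(combine(appearance, prior))
```

Appearance alone cannot separate the two matching objects here; the positional prior resolves the ambiguity, and `focus` is the object nearest the pointed-at location.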

CiteSeerX

Crossref